perm filename B06.TEX[162,RWF] blob
sn#750187 filedate 1984-04-06 generic text, type C, neo UTF8
\rm
\def\today{\ifcase\month\or
January\or February\or March\or April\or May\or June\or
July\or August\or September\or October\or November\or December\fi
\space\number\day, \number\year}
\line{\sevenrm 162B06.tex[v1,rwf] \today\hfill}
\noindent
CS 162
\noindent
Robert W. Floyd
\noindent
January 1984
\vskip .2in
\noindent
{\bf Reference sheet on normal distributions}
$$\eqalign{\int x↑n e↑{-Bx↑2}dx
&= -{1\over 2B} \int x↑{n-1} d\left( e↑{-Bx↑2}\right)\cr
&= -{1\over 2B} \left( x↑{n-1} e↑{-Bx↑2} - \int e↑{-Bx↑2}
d\left( x↑{n-1}\right)\right)\cr
&= -{1\over 2B}x↑{n-1} e↑{-Bx↑2} +{n-1\over 2B}\int x↑{n-2}e↑{-Bx↑2}dx \cr}$$
\vskip 0.1in
\hrule
\vskip 0.1in
$$\eqalign{ \int x e↑{-Bx↑2}dx & ={1\over 2} \int e↑{-Bx↑2} d\left( x↑2\right) \cr
& =-{1\over 2B}
\int d \left( e↑{-Bx↑2}\right) \cr & =-{1\over 2B} e↑{-Bx↑2}\cr}$$
\vskip 0.1in
\hrule
\vskip 0.1in
Find $I = \int↑∞↓{-∞} e↑{-Bx↑2}dx$.
$$\eqalign{ I↑2 & = \int↑∞↓{-∞}\int↑∞↓{-∞} e↑{-Bx↑2} e↑{-By↑2} dy dx
= \int\int e↑{-B(x↑2+y↑2)}dy dx\cr
&=\int↑{2π}↓0\int↑∞↓0 e↑{-Br↑2} r dr d\theta = 2π \int↑∞↓0 r e↑{-Br↑2}dr
= {π\over B} \left. \left[ -e↑{-Br↑2}\right] \right| ↑∞↓0 = {π\over B},\cr}$$
so $I = \sqrt{π\over B}$.
The distribution $n(x)=A e↑{-Bx↑2}$ must have zeroth moment $M↓0 = 1$ to be a probability
distribution: $$\int↑∞↓{-∞} A e↑{-Bx↑2}dx = 1,\quad A\sqrt{π\over B} = 1,\quad
A = \sqrt{B\over π},$$ so $n(x) = \sqrt{B\over π} e↑{-Bx↑2}$. Since $n(x)$ is
an even function, the mean is zero. The variance is
$$\eqalign{M↓2 &= \sqrt{B\over π} \int↑∞↓{-∞} x↑2
e↑{-Bx↑2}dx = \sqrt{B\overπ}
\left\{ \left. \left[ -{1\over 2B} x e↑{-Bx↑2}\right]\right| ↑∞↓{-∞} +
{1\over 2B} \int↑∞↓{-∞} e↑{-Bx↑2}dx \right\}\cr
&= \sqrt{B\overπ}
\left\{ 0+{1\over 2B}
\sqrt{π\over B}\right\} = {1\over2B}\cr}$$
so the standard deviation is $\sigma = \sqrt{M↓2} = \sqrt{1\over2B}$;
inverting, $B ={1\over2\sigma↑2}$, and $n(x)$, with standard deviation $\sigma$, is
$${1\over\sigma}\sqrt{1\over2π} e↑{-{1\over2}\left( x/\sigma \right)↑2}.$$
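\par\noindent
As a further use of the reduction formula at the top of the sheet (a worked sketch;
$M↓4$ here denotes the fourth moment, a symbol not used elsewhere on this sheet):
taking $n=4$, and noting that the boundary term vanishes at $\pm∞$,
$$M↓4 = \sqrt{B\over π}\int↑∞↓{-∞} x↑4 e↑{-Bx↑2}dx = {3\over 2B} M↓2
= {3\over 4B↑2} = 3\sigma↑4.$$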
\vfill \eject
What is the chance that a number drawn from a normal distribution with mean zero
and standard deviation $\sigma$ is greater than $C\sigma$?
It is
$$f(C) ={1\over\sigma}\sqrt{1\over2π} \int↑∞↓{C\sigma} e↑{-{1\over2}\left( x/\sigma
\right)↑2}{dx} = \sqrt{1\over2π}\int↑∞↓C e↑{-x↑2/2}dx.$$ This integral has no
closed form in elementary functions, although it is tabulated in many handbooks. For
$C\gg 1$, however, it can be closely approximated. By the change of variable $x = y + C$,
$$\eqalign{f(C) &= \sqrt{1\over2π} \int↑∞↓0 e↑{-{1\over2}(y+C)↑2}dy
= \sqrt{1\over2π}e↑{-C↑2/2}
\int↑∞↓0 e↑{-{1\over2}y↑2} e↑{-Cy} dy\cr
&< \sqrt{1\over2π}e↑{-C↑2/2} \int↑∞↓0 e↑{-Cy}
dy = \sqrt{1\over2π}\cdot {1\over C} e↑{-C↑2/2}\cr}.$$
For $C = 1, 2, 3$, the values of $f(C)$
and of the above bound are:
\halign{\qquad #\hfil \quad & #\hfil \quad & #\hfil\cr
$C$&${f(C)}$ &upper bound\cr
1 &{.1587} &{.242}\cr 2 &{.0228} &{.0270}\cr 3 &{.00135} &{.00148}\cr}
\noindent
(For a closer approximation, approximate $e↑{-{1\over2}y↑2}$ not by $1$ but by
$1 - y↑2/2$.)
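\par\noindent
A sketch of that refinement, assuming the substitution is made directly in the
integral for $f(C)$ above and using $\int↑∞↓0 y↑2 e↑{-Cy}dy = 2/C↑3$:
$$\sqrt{1\over2π}e↑{-C↑2/2}\int↑∞↓0 \left(1-{y↑2\over2}\right) e↑{-Cy}dy
= \sqrt{1\over2π}e↑{-C↑2/2}\left({1\over C}-{1\over C↑3}\right).$$
Because $e↑{-{1\over2}y↑2}\ge 1-y↑2/2$, this is a lower bound on $f(C)$; for $C=3$
it gives about $.00131$, against $f(3)\approx .00135$.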
\vskip .5in
\noindent
{\bf Obsolete section}

\noindent
If $d$ is any distribution with finite variance, taking its convolution with
itself $k$ times (written $d\circ k$) gives, for large $k$, a nearly normal distribution; this is
the {\it central limit theorem} of statistics. If we know the mean $M(d)$ and variance $V(d)$
of $d$, and $k$ is large, $d\circ k$ can be well approximated by the normal
distribution with mean $kM(d)$ and variance $kV(d)$; this is the distribution
$$n(x) ={1\over\sqrt{2πkV(d)}} e↑{-(x-kM(d))↑2/(2kV(d))}.$$ In
particular, if $d$ is the coin-flipping distribution $d(0) = d(1) ={1\over 2}$,
with $M(d) = {1\over2}$ and $V(d)
= {1\over4}$, then $(d\circ k)(i) = 2↑{-k}{k\choose i}$.
By the central limit theorem, this does not differ much from $${1\over\sqrt k}
\sqrt{2\over π} e↑{-2(i-k/2)↑2/k}.$$ We can then approximate
the binomial coefficient $${k\choose i} \approx 2↑k{1\over\sqrt k}\sqrt{2\over π}
e↑{-2(i-k/2)↑2/k};$$ if $i = k/2$, we get $${k\choose k/2}\approx 2↑k \cdot
{1\over\sqrt k} \cdot 0.79788.$$
As an illustration, ${20\choose 10}= 184756$;
the approximation gives $187077$, an error of about $1.3\%$. If $i$ is not close
to $k/2$, the relative error is larger, although the absolute error
is smaller: ${20\choose 4} = 4845$, while the approximation gives $5111.7$, an
error of $5.5\%$.
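\par\noindent
The second figure can be reproduced as follows, in a sketch that reuses the rounded
constant $0.79788$ and takes $e↑{-3.6}\approx 0.027324$:
$${20\choose 4}\approx {2↑{20}\over\sqrt{20}}\cdot 0.79788\cdot e↑{-2(4-10)↑2/20}
\approx 187077\cdot e↑{-3.6}\approx 5111.7.$$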
\vfill \end